DETAILED ACTION 

This action is responsive to application 10/628546 filed on 7/28/2003. Claims 1-9, 11-16, 
and 18-23 have been examined. The previous office actions have been withdrawn. 

Claim Objections 

Claims 1, 2, 7, 11-15, and 23 are objected to because of the following informalities: 
- The phrase "and/or" is not clear in scope. Examiner suggests "or". 
Appropriate correction is required. 

Claim Rejections - 35 USC §112 
The following is a quotation of the first paragraph of 35 U.S.C. 112: 

The specification shall contain a written description of the invention, and of the manner and process of making 
and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it 
pertains, or with which it is most nearly connected, to make and use the same and shall set forth the best mode 
contemplated by the inventor of carrying out his invention. 

Claims 1-9, 11-16, and 18-23 are rejected under 35 U.S.C. 112, first paragraph, as failing 
to comply with the written description requirement. The claim(s) contains subject matter which 
was not described in the specification in such a way as to reasonably convey to one skilled in the 
relevant art that the inventor(s), at the time the application was filed, had possession of the 
claimed invention. Claims 1, 11, 14, and 21 describe non-standardized data being scored based 
on shifting and scaling the data, however the data is only virtually shifted, not shifted and scaled. 
The disclosure does not describe a way in which the data can only be virtually shifted, and not 
virtually scaled to achieve a score. Claims 2-9, 12-13, 15-16, 18-20, and 22-23 are rejected based 
on their dependency to claims 1, 11, 14, and 21. 

Claim 21 recites a "first data field describing a non-standardized set or subset of data" 
and "a second data field describing a decision tree and associated branches". These fields are 
a part of a tangible medium with a data structure stored thereon. It is unclear how this tangible 
medium with a data structure would be able to describe an entire decision tree and its associated 
branches in a singular "field". No description in the disclosure indicates that this data 
structure with three data fields is functional. Claims 22 and 23 are rejected based on their 
dependency to claim 21. 

Claim Rejections - 35 USC § 101 

35 U.S.C. 101 reads as follows: 

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or 
any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and 
requirements of this title. 

Claims 1-9, 11-16, and 18-23 are rejected under 35 U.S.C. 101 because the claimed 
invention is directed to non-statutory subject matter. 

In determining whether the claim is for a "practical application," the focus is not on 
whether the steps taken to achieve a particular result are useful, tangible, and concrete, but rather 
that the final result achieved by the claimed invention is useful, tangible and concrete. If the 
claim is directed to a practical application of the §101 judicial exceptions producing a result tied 
to the physical world that does not preempt the judicial exception, then the claim meets the 
statutory requirement of 35 U.S.C. §101. 

The claims are a manipulation of abstract concepts and are not clear in purpose or scope. 
Variations on the phrases in the claims, such as 'virtually shifted through omission of a matrix 
operation' do not provide a clear purpose or scope for the claimed invention. 

The invention must be for a practical application and either: 
1) specify transforming (physical thing - article) or 
2) have the Final Result (not the steps) achieve or produce a 
useful (specific, substantial, AND credible), 
concrete (substantially repeatable/non-unpredictable), AND 
tangible (real world/non-abstract) result 

(tangibility is the opposite of abstractness). 
A claim that is so broad that it reads on both statutory and non-statutory subject matter 
must be amended. 

In the present case, claims 1-9, 11-16, and 18-23 preempt a wide variety of data mining 
using decision tree learning. Data mining is the process of identifying commercially useful 
patterns or relationships in databases or other computer repositories. Data mining is not a 
practical application, but rather a technique which can be employed for practical application, 
such as implementing data mining for understanding consumer grocery purchases, data mining 
for monitoring the efficiency of a website's navigation, data mining for pattern recognition in 
images, data mining for diagnosis of medical illnesses, etc. 

The courts have also held that a claim may not preempt ideas, laws of nature or natural 
phenomena. The concern over preemption was expressed as early as 1852. See Le Roy v. 
Tatham, 55 U.S. (14 How.) 156, 175 (1852) ("A principle, in the abstract, is a fundamental truth; 
an original cause; a motive; these cannot be patented, as no one can claim in either of them an 
exclusive right."); Funk Bros. Seed Co. v. Kalo Inoculant Co., 333 U.S. 127, 132, 76 USPQ 280, 
282 (1948). 

Accordingly, one may not patent every "substantial practical application" of an idea, law 
of nature or natural phenomena because such a patent "in practical effect would be a patent on 
the [idea, law of nature or natural phenomena] itself." "Here the 'process' claim is so abstract 
and sweeping as to cover both known and unknown uses of the BCD to pure-binary conversion. 
The end use may (1) vary from the operation of a train to verification of drivers' licenses to 
researching the law books for precedents and (2) be performed through any existing machinery 
or future-devised machinery or without any apparatus." Gottschalk v. Benson, 409 U.S. 63, 
71-72, 175 USPQ 673, 676 (1972). 

The Courts have found that subject matter that is not a practical application or use of an 
idea, a law of nature or a natural phenomenon is not patentable. As the Supreme Court has made 
clear, "[a]n idea of itself is not patentable," Rubber-Tip Pencil Co. v. Howard, 87 U.S. (20 
Wall.) 498, 507 (1874); taking several abstract ideas and manipulating them together adds 
nothing to the basic equation. In re Warmerdam, 31 USPQ2d 1754 (Fed. Cir. 1994). 

Claims that score a split in data are not statutory without a final output that is useful, 
concrete, and tangible. Claims 1-9, 11-16, and 18-23 do not provide an output. The scoring of a 
split is simply a manipulation of data, and therefore abstract. 

Appropriate corrections are required. 

Claim Rejections - 35 USC §103 
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

Claims 1-3, 5-7, 11-15, and 19-23 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Chickering et al. ("Efficient Determination of Dynamic Split Points in a Decision Tree", 
1997) and further in view of Riskin et al. ("Lookahead in Growing Tree-Structured Vector 
Quantizers", 1991) hereafter referred to as Riskin. 
Claim 1, 11, 21 

Chickering discloses a system that facilitates decision tree learning, comprising: 

a learning component that generates non-standardized data (see e.g., 1. Introduction, 2. 
Background and Notation; EN: Data has not been standardized) that relates to a split in a 
decision tree (predictor values, see e.g., 1. Introduction, especially p 92, C 1, where predictor 
values identify potential split points; 2. Background and Notation; 3. Some Efficient 
Discretization Methods, especially 'identifying split points'); and 

a scoring component that scores the split as if the non-standardized data at a subset of 
leaves of the decision tree had been shifted and/or scaled (see e.g., 1. Introduction, especially p 
92, C 1; 2. Background and Notation, especially p 92-93). 

Chickering does not specifically disclose the non-standardized data virtually shifted through 
omission of a matrix operation. 

However, Riskin teaches the non-standardized data virtually shifted through omission of a 
matrix operation (lookahead, see e.g., IV. Lookahead in Growing Trees, especially p 2290 C 2, 
"... it looks ahead to a depth of one in the tree to measure the resulting slope of a decrease in 
distortion to increase in rate before it splits a candidate node", EN: the lookahead performs a 
measurement of a possible future node, such as shifting, but does not complete the shift. This 
reads on "virtually shift"). 

It would have been obvious to one of ordinary skill in the art at the time the invention was made 
to combine the teachings of Chickering with Riskin. One would have been motivated to do so 
because lookahead is a commonly known technique used in algorithms for efficiency: because it 
uses measurements and scores of candidate nodes, it relieves the need for pruning of the tree. 
Lookahead reduces the complexity of the growing process and increases its efficiency 
(Riskin). 
Claim 2, 13, 15, 23 

Chickering discloses the system of claim 1, further comprising a modification component that 
for a respective candidate split score, the data is modified by shifting and/or scaling the data and 
a new score is computed on the modified data (scale, or standardize, see e.g., 1. Introduction, 2. 
Background and Notation; 4. Experiments, especially "standardized the . . . data so that the target 
had mean zero and standard deviation one", EN: data is shifted and scaled to standardize the 
data). 
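
As an illustration of the standardization quoted from Chickering above (shifting and scaling so that the target has mean zero and standard deviation one), the following is a minimal sketch. The function name `standardize` and the sample values are assumptions made for illustration only; they do not come from the reference or from the application.

```python
import math


def standardize(values):
    """Shift by the mean, then scale by the population standard deviation."""
    n = len(values)
    mean = sum(values) / n                      # the shift
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)                   # the scale
    if std == 0:
        raise ValueError("constant data cannot be scaled to unit deviation")
    return [(v - mean) / std for v in values]


# Hypothetical target values, chosen only so the arithmetic is easy to follow.
target = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = standardize(target)
```

After the call, `z` has mean zero and standard deviation one, which is the combined shift-and-scale operation the EN note describes.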

Claim 3, 22 

Chickering discloses the system of claim 1, further comprising an optimization component that 
analyzes the data and decides to treat the data as if it was: (1) shifted, (2) scaled, or (3) shifted 
and scaled (scale, or standardize, see e.g., 1. Introduction, especially "scale linearly"; 2. 
Background and Notation; 4. Experiments, especially "standardized the . . . data so that the target 
had mean zero and standard deviation one", EN: data is shifted and scaled to standardize the 
data). 
Claim 5 
Chickering discloses the system of claim 1, the learning component processes continuous 
variable data or data subsets (continuous, see e.g., 1. Introduction; 2. Background and Notation; 
4. Experiments). 
Claim 6 

Chickering discloses the system of claim 1, the scoring component generates evaluation 
indicating how well a model predicts continuous target data and whether or not the model is a 
suitable predictor for the target data (predictive accuracy, see e.g., 4. Experiments, EN: a tree 
models the data). 
Claim 7 

Chickering discloses the system of claim 6, the evaluation data is employed by users and/or 
subsequent automated components (see e.g., 4. Experiments; EN: experiment used datasets from 
the UC Irvine repository on a computer, where a computer is an 'automated component') when 
determining model performance and/or selecting between models or model subsets (evaluate 
performance, see e.g., 4. Experiments). 
Claim 12 

Chickering discloses the system of claim 11, further comprising means for determining whether 
to perform the shifting and/or scaling operations (see e.g., 2. Background and Notation; 3. Some 
Efficient Discretization Methods; EN: split points are examined and it is determined if they are 
suitable for approximation, by which they are then shifted and scaled to reach a desired 
distribution). 
Claim 14 

Chickering discloses a method that facilitates decision tree learning, comprising: 
determining whether to perform a virtual shifting and/or scaling operation on a 
non-standardized set of data associated with leaves of a decision tree (see e.g., 2. Background and 
Notation; 3. Some Efficient Discretization Methods; EN: split points are examined and it is 
determined if they are suitable for approximation, by which they are then shifted and scaled to 
reach a desired distribution); and 

automatically assigning scores to the leaves based in part upon the determination of 
whether to perform the virtual shifting and/or scaling operation (scoring criteria, see e.g., 2. 
Background and Notation). 

However, Riskin teaches the non-standardized data virtually shifted through omission of a 
matrix operation (lookahead, see e.g., IV. Lookahead in Growing Trees, especially p 2290 C 2, 
"... it looks ahead to a depth of one in the tree to measure the resulting slope of a decrease in 
distortion to increase in rate before it splits a candidate node", EN: the lookahead performs a 
measurement of a possible future node, such as shifting, but does not complete the shift. This 
reads on "virtually shift"). 

It would have been obvious to one of ordinary skill in the art at the time the invention was made 
to combine the teachings of Chickering with Riskin. One would have been motivated to do so 
because lookahead is a commonly known technique used in algorithms for efficiency: because it 
uses measurements and scores of candidate nodes, it relieves the need for pruning of the tree. 
Lookahead reduces the complexity of the growing process and increases its efficiency 
(Riskin). 
Claim 19 
Chickering discloses the method of claim 14, determining at least one constant value before 
assigning the scores (k, see e.g., 3. Some Efficient Discretization Methods; 4. Experiments). 
Claim 20 

Chickering discloses the method of claim 19, the constant value relates to diagonal elements of 
a matrix and is assigned a value of about 0.01 (k = 0.01, see e.g., 4. Experiments, EN: a matrix is 
merely a data structure which elements may have a diagonal value of 0.01 since it is not defined 
what is in the matrix). 

Claim Rejections - 35 USC §103 
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

Claim 4 is rejected under 35 U.S.C. 103(a) as being unpatentable over Chickering and Riskin 

and further in view of Heckerman (Bayesian Networks for Data Mining, 1997). 

Claim 4 

Chickering and Riskin do not disclose the system of claim 1, the scoring component is 
employed for evaluating a data mining application. 

However Heckerman teaches the system of claim 1, the scoring component is employed for 
evaluating a data mining application (data mining, see e.g., Abstract, 1. Introduction). 
It would have been obvious to one of ordinary skill in the art at the time the invention was made 
to combine the teachings of Chickering and Riskin with Heckerman. One would have been 
motivated to do so because Bayesian networks, which use a scoring component, can readily 
handle incomplete data sets, allow one to learn about causal relationships, and in conjunction 
with Bayesian statistical techniques facilitate the combination of domain knowledge and data 
(Heckerman). 

Claim Rejections - 35 USC §103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

Claims 8, 9, 16 and 18 are rejected under 35 U.S.C. 103(a) as being unpatentable over Chickering 
and Riskin and further in view of Minka (Bayesian linear regression, 1999). 
Claim 8 

Chickering and Riskin disclose the system of claim 1, the scoring component includes at least 
one of a data sample processor (see e.g., 4. Experiments; EN: experiment used data samples from 
the UC Irvine repository on a computer, where a computer contains a processor), and a mean 
value for data or a data subset (see mean, e.g., 3. Some Efficient Discretization Methods; 4. 
Experiments). 

Chickering and Riskin do not specifically disclose a scoring constant, a gamma function, a 
matrix value, and a vector value. 

However Minka discloses a scoring constant (V, see e.g., 1. Introduction, EN: Jeffrey's prior has 
invariant X and invariant V; 2. Known V, EN: V is known and constant), a gamma function (see 
e.g., 1. Introduction, especially "Γ"), a matrix value (matrix A, see e.g., 1. Introduction), and a 
vector value (vector v, see e.g., 1. Introduction). 

It would have been obvious to one of ordinary skill in the art at the time the invention was made 
to combine the teachings of Chickering and Riskin with Minka. One would have been 
motivated to do so because Chickering describes the benefit and use of Bayesian scoring 
criterion, which avoids over-fitting by penalizing model complexity. Bayesian linear regression 
helps avoid over-fitting (Minka). 

Claim 9 

Chickering and Riskin do not specifically disclose the system of claim 1, the scoring 
component computes a Bayesian linear regression score as: 
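
[equation not reproduced] 

wherein μ represents a mean, α denotes a degree of freedom, β connotes a pre-defined 
constant, bold-face symbols denote square matrices, symbols with overlines denote 
(one dimensional) vectors, the ' symbol denotes transpose, and | | denotes 
determinant, n represents a number of records in the data, Γ is a gamma function 
satisfying Γ(x) = (x-1) Γ(x-1), x_i denotes a vector of values for relevant variables 
in an ith case in the data, the superscripts TR and R in T_n^TR and T_n^R denote 
that the matrices are defined with respect to target and regressor variables in a first 
case and regressor variables in a second case. 
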
However Minka teaches the system of claim 1, the scoring component computes a 
Bayesian linear regression score as: 

[equation not reproduced] 

wherein μ represents a mean, α denotes a degree of freedom, β connotes a pre-defined 
constant, bold-face symbols denote square matrices, symbols with overlines denote 
(one dimensional) vectors, the ' symbol denotes transpose, and | | denotes 
determinant, n represents a number of records in the data, Γ is a gamma function 
satisfying Γ(x) = (x-1) Γ(x-1), x_i denotes a vector of values for relevant variables 
in an ith case in the data, the superscripts TR and R in T_n^TR and T_n^R denote 
that the matrices are defined with respect to target and regressor variables in a first 
case and regressor variables in a second case. 

(see e.g., 1. Introduction, especially equations 11, 12, 14; 2.1 Model selection via the evidence, 
especially equations 27, 29; EN: the "score" is merely a Bayesian linear regression, based on a 
matrix with a Wishart distribution and is anticipated because it achieves the same goal) 
It would have been obvious to one of ordinary skill in the art at the time the invention was made 
to combine the teachings of Chickering and Riskin with Minka. One would have been 
motivated to do so because Chickering describes the benefit and use of Bayesian scoring 
criterion, which avoids over-fitting by penalizing model complexity. Bayesian linear regression 
helps avoid over-fitting (Minka). 
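
The gamma function property recited in claim 9, Γ(x) = (x-1) Γ(x-1), can be checked numerically with the standard library. The sketch below is illustrative only: the helper names are hypothetical, and the use of log-gamma differences is an assumption about how a ratio of gamma functions might be evaluated stably, not something stated in the claim or in Minka.

```python
import math


def gamma_recurrence_gap(x):
    """Relative gap between gamma(x) and (x - 1) * gamma(x - 1), for x > 1."""
    lhs = math.gamma(x)
    rhs = (x - 1) * math.gamma(x - 1)
    return abs(lhs - rhs) / abs(lhs)


def log_gamma_ratio(a, b):
    """log(gamma(a) / gamma(b)), using lgamma to avoid overflow for large a, b."""
    return math.lgamma(a) - math.lgamma(b)
```

For example, `gamma_recurrence_gap(5.5)` is at the level of floating-point rounding, and `log_gamma_ratio(6, 5)` equals `log(5)` because Γ(6) = 5 Γ(5).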
Claim 16 

Chickering and Riskin do not specifically disclose the method of claim 14, further comprising 
processing a model in a form of a linear regression. 

However Minka teaches the method of claim 14, further comprising processing a model in 
a form of a linear regression (equations 1, 2, 3, see e.g., 1. Introduction). 
It would have been obvious to one of ordinary skill in the art at the time the invention was made 
to combine the teachings of Chickering and Riskin with Minka. One would have been 
motivated to do so because linear regression is a common technique and helps avoid over-fitting 
(Minka). 
Claim 18 

Chickering and Riskin do not specifically disclose the method of claim 14, the virtual shifting 
operation includes modifying a subset of elements relating to a covariance matrix. 
However Minka teaches the method of claim 14, the virtual shifting operation includes 
modifying a subset of elements relating to a covariance matrix (equation 5, see e.g., 1. 
Introduction; 2. Known V; 2.1 Model selection via the evidence). 

It would have been obvious to one of ordinary skill in the art at the time the invention was made 
to combine the teachings of Chickering and Riskin with Minka. One would have been 
motivated to do so because a covariance matrix is often used in linear regression and is simply a 
larger data structure which holds the covariances of a scalar vector. 
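
As background on how shifting and scaling interact with a covariance matrix, the sketch below shows two general properties: adding a constant to a variable leaves the covariance matrix unchanged, while multiplying a variable by a constant changes only the entries involving that variable. This is not the claimed method and not a formula taken from Minka; the function name and the sample records are hypothetical.

```python
def covariance_matrix(rows):
    """Population covariance matrix of a list of equal-length records."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n
            for j in range(d)
        ]
        for i in range(d)
    ]


# Hypothetical two-variable records.
records = [[1.0, 2.0], [2.0, 1.0], [4.0, 5.0], [5.0, 4.0]]
shifted = [[x + 10.0, y - 3.0] for x, y in records]   # shift both variables
scaled = [[2.0 * x, y] for x, y in records]           # scale the first variable

cov = covariance_matrix(records)
cov_shifted = covariance_matrix(shifted)
cov_scaled = covariance_matrix(scaled)
```

Here `cov_shifted` matches `cov`, whereas in `cov_scaled` the variance of the first variable is multiplied by 4 and its covariance with the second variable by 2.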

Conclusion 

The prior art of record and not relied upon is considered pertinent to the applicant's 
disclosure. 

- Tal et al. (Patent No. 6532457) 

- Bernhardt et al. (Pub No. 2004/0002879) 

- Yaung (Pub No. 2003/0023662) 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Melissa Herman whose telephone number is 571-270-1393. The 
examiner can normally be reached on 9/4/5. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David Vincent, can be reached on 571-272-3080. The fax phone number for the 
organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would 
like assistance from a USPTO Customer Service Representative or access to the automated 
information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 



Melissa Herman 


